Gulf of Guinea
- Atlantic Ocean > South Atlantic Ocean > Gulf of Guinea (0.08)
- Africa > Gulf of Guinea (0.08)
- Europe > Norway (0.07)
- (2 more...)
Learning to Interpret Weight Differences in Language Models
Goel, Avichal, Kim, Yoon, Shavit, Nir, Wang, Tony T.
Finetuning (pretrained) language models is a standard approach for updating their internal parametric knowledge and specializing them to new tasks and domains. However, the corresponding model weight changes ("weight diffs") are not generally interpretable. While inspecting the finetuning dataset can give a sense of how the model might have changed, these datasets are often not publicly available or are too large to work with directly. Towards the goal of comprehensively understanding weight diffs in natural language, we introduce Diff Interpretation Tuning (DIT), a method that trains models to describe their own finetuning-induced modifications. Our approach uses synthetic, labeled weight diffs to train a DIT-adapter, which can be applied to a compatible finetuned model to make it describe how it has changed. We demonstrate in two proof-of-concept settings (reporting hidden behaviors and summarizing finetuned knowledge) that our method enables models to describe their finetuning-induced modifications using accurate natural language descriptions.
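The raw object DIT interprets, a weight diff, is just a per-parameter subtraction. A minimal sketch, using toy dict-of-arrays "models" as stand-ins for real state dicts (this is not the paper's adapter training, only the diff itself):

```python
import numpy as np

def weight_diff(base, finetuned):
    """Per-parameter change induced by finetuning. The dict-of-arrays
    "models" here are toy stand-ins for real model state dicts."""
    return {name: finetuned[name] - base[name] for name in base}

# Toy example: one "layer" whose weights shifted during finetuning.
base = {"layer.weight": np.zeros((2, 2))}
finetuned = {"layer.weight": np.array([[0.1, 0.0], [0.0, -0.2]])}

diff = weight_diff(base, finetuned)
print(diff["layer.weight"].shape)  # (2, 2)
```

A DIT-adapter would then be trained on many such diffs, each paired with a natural-language label describing the change.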
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Russia (0.04)
- Asia > Russia (0.04)
- (5 more...)
- Media > Music (0.46)
- Leisure & Entertainment > Sports (0.46)
- Leisure & Entertainment > Games (0.46)
Real-time, Adaptive Radiological Anomaly Detection and Isotope Identification Using Non-negative Matrix Factorization
Jones, Chandler, Bandstra, Mark, Faaland, Stefan, Lai, Yue Shi, Abgrall, Nico, Suchyta, Scott, Cooper, Reynold
Spectroscopic anomaly detection and isotope identification algorithms are integral components in nuclear nonproliferation applications such as search operations. The task is especially challenging for mobile detector systems because the observed gamma-ray background varies far more than it does for a static detector, so a pretrained background model can easily find itself out of domain. As a result, algorithms may exceed their intended false alarm rate, or sacrifice detection sensitivity to maintain it. Non-negative matrix factorization (NMF) has been shown to be a powerful tool for spectral anomaly detection and identification, but, like many algorithms that rely on data-driven background models, its conventional implementation cannot update in real time to account for environmental changes that affect the background spectroscopic signature. We have developed a novel NMF-based algorithm that periodically updates its background model to accommodate changing environmental conditions. The Adaptive NMF algorithm involves fewer assumptions about its environment, making it more generalizable than existing NMF-based methods while maintaining or exceeding detection performance on simulated and real-world datasets.
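The underlying NMF anomaly-detection idea (not the paper's adaptive variant) can be sketched in a few lines: learn nonnegative background components from training spectra with the classic multiplicative updates, then score a new spectrum by its reconstruction error under those components. All data below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "background" spectra: nonnegative mixtures of two fixed shapes.
n_bins = 16
shapes = np.abs(rng.normal(size=(2, n_bins)))      # stand-in component spectra
mix = np.abs(rng.normal(size=(200, 2)))            # per-measurement mixing weights
V = mix @ shapes                                   # training matrix (200 spectra)

# Classic multiplicative-update NMF: V ~= A @ W.
k = 2
A = np.abs(rng.normal(size=(200, k))) + 1e-3
W = np.abs(rng.normal(size=(k, n_bins))) + 1e-3
for _ in range(300):
    A *= (V @ W.T) / (A @ W @ W.T + 1e-12)
    W *= (A.T @ V) / (A.T @ A @ W + 1e-12)

def recon_error(x, W, n_iter=300):
    """Fit nonnegative coefficients h for spectrum x against the learned
    background components W; the residual norm serves as the anomaly score."""
    h = np.full(W.shape[0], x.mean() + 1e-3)
    for _ in range(n_iter):
        h *= (W @ x) / (W @ W.T @ h + 1e-12)
    return np.linalg.norm(x - h @ W)

background = mix[0] @ shapes                       # spectrum the model explains
anomaly = background + 5.0 * np.eye(n_bins)[3]     # spike in one bin (a "source")
print(recon_error(anomaly, W) > recon_error(background, W))
```

The adaptive algorithm's contribution is in how W is periodically refreshed as the environment changes, which this static sketch does not attempt.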
- North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
- North America > United States > New Mexico (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- (9 more...)
- Health & Medicine (1.00)
- Energy (1.00)
- Government > Regional Government > North America Government > United States Government (0.93)
- Government > Military (0.68)
Ev2R: Evaluating Evidence Retrieval in Automated Fact-Checking
Akhtar, Mubashara, Schlichtkrull, Michael, Vlachos, Andreas
Current automated fact-checking (AFC) approaches commonly evaluate evidence either implicitly via the predicted verdicts or by comparing retrieved evidence with a predefined closed knowledge source, such as Wikipedia. However, these methods suffer from limitations, resulting from their reliance on evaluation metrics developed for different purposes and constraints imposed by closed knowledge sources. Recent advances in natural language generation (NLG) evaluation offer new possibilities for evidence assessment. In this work, we introduce Ev2R, an evaluation framework for AFC that comprises three types of approaches for evidence evaluation: reference-based, proxy-reference, and reference-less. We evaluate their effectiveness through agreement with human ratings and adversarial tests, and demonstrate that prompt-based scorers, particularly those leveraging LLMs and reference evidence, outperform traditional evaluation approaches.
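As a crude illustration of the reference-based setting, retrieved evidence can be scored against gold evidence by token-overlap F1; the paper's strongest variants are prompt-based LLM judges, which this stand-in does not attempt. The example sentences are invented:

```python
def evidence_f1(retrieved, reference):
    """Token-overlap F1 between retrieved and gold evidence: a crude,
    reference-based stand-in for an evidence-evaluation scorer."""
    r, g = set(retrieved.lower().split()), set(reference.lower().split())
    overlap = len(r & g)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(r), overlap / len(g)
    return 2 * precision * recall / (precision + recall)

score = evidence_f1("the vaccine was approved in 2021",
                    "regulators approved the vaccine in 2021")
print(round(score, 3))  # 0.833
```

Set-based overlap ignores word order and paraphrase, which is exactly the weakness that motivates the LLM-based scorers.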
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > United Kingdom > Scotland (0.05)
- Atlantic Ocean > South Atlantic Ocean > Gulf of Guinea > Niger Delta (0.04)
- (35 more...)
- Government (1.00)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
- Health & Medicine > Therapeutic Area > Immunology (0.46)
Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG
Jin, Bowen, Yoon, Jinsung, Han, Jiawei, Arik, Sercan O.
Retrieval-augmented generation (RAG) empowers large language models (LLMs) to utilize external knowledge sources. The increasing capacity of LLMs to process longer input sequences opens up avenues for providing more retrieved information, potentially enhancing the quality of generated outputs. It is plausible to assume that a larger retrieval set contains more relevant information (higher recall), which might result in improved performance. However, our empirical findings demonstrate that for many long-context LLMs, the quality of generated output first improves but then declines as the number of retrieved passages increases. This paper investigates this phenomenon, identifying the detrimental impact of retrieved "hard negatives" as a key contributor. To mitigate this and enhance the robustness of long-context LLM-based RAG, we propose both training-free and training-based approaches. We first showcase the effectiveness of retrieval reordering as a simple yet powerful training-free optimization. Furthermore, we explore training-based methods, specifically RAG-specific implicit LLM fine-tuning and RAG-oriented fine-tuning with intermediate reasoning, demonstrating their capacity for substantial performance gains. Finally, we conduct a systematic analysis of design choices for these training-based methods, including data distribution, retriever selection, and training context length.
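The retrieval-reordering idea lends itself to a small sketch: given passages ranked best-first by the retriever, place the strongest at the two ends of the prompt and let the weakest fall in the middle. This is a generic "lost in the middle"-style reordering, assumed here, not necessarily the paper's exact scheme:

```python
def reorder_for_long_context(passages):
    """Reorder retriever output (best first) so the strongest passages sit
    at the start and end of the prompt and weaker ones fall in the middle."""
    front, back = [], []
    for i, p in enumerate(passages):
        (front if i % 2 == 0 else back).append(p)
    return front + back[::-1]

print(reorder_for_long_context(["p1", "p2", "p3", "p4", "p5"]))
# ['p1', 'p3', 'p5', 'p4', 'p2']
```

Being training-free, the reordering changes only prompt construction, so it composes freely with the fine-tuning approaches.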
- North America > Puerto Rico (0.14)
- Africa > West Africa (0.04)
- North America > United States > Illinois (0.04)
- (14 more...)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Leisure & Entertainment (1.00)
- Government (1.00)
- Media > Film (0.67)
- Education (0.66)
Could AI save Nigerians from devastating floods?
In the small village of Ogba-Ojibo in central Nigeria, sitting at the confluence of two of the nation's largest rivers – the Niger and Benue – 27-year-old Ako Prince Omali is counting the steps carved out of the dirt, which lead down the loam-coloured banks of the river Niger. This river bank, dotted with tufts of spiky grass, is where villagers come to fish or wash produce and laundry. Just last week, three of the steps were submerged during one night of rain, which raised the water level by about five metres. Normally, you can count seven steps down into the river. Now, only four remain above the surface of the water, the sticks bracing the muddy steps having washed away in the deluge.
- Africa > Nigeria > Kogi State (0.06)
- North America > United States > New York (0.04)
- North America > Puerto Rico (0.04)
- (8 more...)
Machine learning models for daily rainfall forecasting in Northern Tropical Africa using tropical wave predictors
Satheesh, Athul Rasheeda, Knippertz, Peter, Fink, Andreas H.
Numerical weather prediction (NWP) models often underperform compared to simpler climatology-based precipitation forecasts in northern tropical Africa, even after statistical postprocessing. AI-based forecasting models show promise but have avoided precipitation due to its complexity. Synoptic-scale forcings like African easterly waves and other tropical waves (TWs) are important for predictability in tropical Africa, yet their value for predicting daily rainfall remains unexplored. This study uses two machine-learning models--gamma regression and a convolutional neural network (CNN)--trained on TW predictors from satellite-based GPM IMERG data to predict daily rainfall during the July-September monsoon season. Predictor variables are derived from the local amplitude and phase information of seven TWs from the target grid and its up- and downstream neighbors at 1-degree spatial resolution. The ML models are combined with Easy Uncertainty Quantification (EasyUQ) to generate calibrated probabilistic forecasts and are compared with three benchmarks: Extended Probabilistic Climatology (EPC15), the ECMWF operational ensemble forecast (ENS), and a probabilistic forecast from the ENS control member using EasyUQ (CTRL EasyUQ). The study finds that downstream predictor variables offer the highest predictability, with downstream tropical depression (TD)-type wave-based predictors being most important. Other waves like mixed Rossby-gravity (MRG), Kelvin, and inertio-gravity waves also contribute significantly but show regional preferences. ENS forecasts exhibit poor skill due to miscalibration. CTRL EasyUQ shows improvement over ENS and marginal enhancement over EPC15. Both gamma regression and CNN forecasts significantly outperform the benchmarks in tropical Africa. This study highlights the potential of ML models trained on TW-based predictors to improve daily precipitation forecasts in tropical Africa.
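The gamma-regression branch can be sketched on fabricated data: build amplitude-times-sin/cos-of-phase features for seven waves (a plausible encoding of the amplitude/phase predictors, assumed here), then fit a log-link gamma regression by gradient descent on the gamma deviance. The CNN and EasyUQ steps are omitted:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fabricated predictors: per-wave amplitude times sin/cos of local phase,
# mimicking the amplitude/phase features described above (7 waves -> 14 columns,
# plus an intercept).
n_days, n_waves = 400, 7
amp = np.abs(rng.normal(size=(n_days, n_waves)))
phase = rng.uniform(0.0, 2.0 * np.pi, size=(n_days, n_waves))
X = np.hstack([np.ones((n_days, 1)), amp * np.sin(phase), amp * np.cos(phase)])

# Synthetic positive rainfall with a log-linear dependence on the predictors.
beta_true = rng.normal(scale=0.3, size=X.shape[1])
y = rng.gamma(shape=2.0, scale=np.exp(X @ beta_true) / 2.0) + 1e-6

# Gamma regression with a log link, fitted by gradient descent on the
# gamma deviance: grad = X^T((mu - y) / mu), with mu = exp(X beta).
beta = np.zeros(X.shape[1])
for _ in range(4000):
    mu = np.exp(X @ beta)
    beta -= 0.005 * X.T @ ((mu - y) / mu) / n_days

mu = np.exp(X @ beta)   # fitted conditional-mean rainfall
```

A production fit would use an IRLS/GLM solver rather than plain gradient descent; the sketch only shows the model structure.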
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Africa > Central Africa (0.05)
- Africa > Niger > Niamey > Niamey (0.05)
- (18 more...)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
Guylingo: The Republic of Guyana Creole Corpora
Clarke, Christopher, Daynauth, Roland, Wilkinson, Charlene, Devonish, Hubert, Mars, Jason
While major languages often enjoy substantial attention and resources, the linguistic diversity across the globe encompasses a multitude of smaller, indigenous, and regional languages that lack the same level of computational support. One such region is the Caribbean. While commonly labeled as "English speaking", the ex-British Caribbean region consists of a myriad of Creole languages thriving alongside English. In this paper, we present Guylingo: a comprehensive corpus designed for advancing NLP research in the domain of Creolese (Guyanese English-lexicon Creole), the most widely spoken language in the culturally rich nation of Guyana. We first outline our framework for gathering and digitizing this diverse corpus, inclusive of colloquial expressions, idioms, and regional variations in a low-resource language. We then demonstrate the challenges of training and evaluating NLP models for machine translation in Creole. Lastly, we discuss the unique opportunities presented by recent NLP advancements for accelerating the formal adoption of Creole languages as official languages in the Caribbean.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- Atlantic Ocean > South Atlantic Ocean > Gulf of Guinea (0.04)
- Africa > Gulf of Guinea (0.04)
- (10 more...)
- Education (0.68)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
- Health & Medicine > Therapeutic Area > Immunology (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.70)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.50)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)
NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data
Tonneau, Manuel, de Castro, Pedro Vitor Quinta, Lasri, Karim, Farouq, Ibrahim, Subramanian, Lakshminarayanan, Orozco-Olvera, Victor, Fraiberger, Samuel P.
To address the global issue of online hate, hate speech detection (HSD) systems are typically developed on datasets from the United States, thereby failing to generalize to English dialects from the Majority World. Furthermore, HSD models are often evaluated on non-representative samples, raising concerns about overestimating model performance in real-world settings. In this work, we introduce NaijaHate, the first HSD-annotated dataset containing a representative sample of Nigerian tweets. We demonstrate that HSD evaluated on the biased datasets traditionally used in the literature consistently overestimates real-world performance by at least twofold. We then propose NaijaXLM-T, a pretrained model tailored to the Nigerian Twitter context, and establish the key role played by domain-adaptive pretraining and finetuning in maximizing HSD performance. Finally, owing to the modest performance of HSD systems in real-world conditions, we find that content moderators would need to review about ten thousand Nigerian tweets flagged as hateful daily to moderate 60% of all hateful content, highlighting the challenges of moderating hate speech at scale as social media usage continues to grow globally. Taken together, these results pave the way towards robust HSD systems and better protection of social media users from hateful content in low-resource settings.
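The moderation-load claim is a precision/recall calculation. A back-of-the-envelope version, where every number is invented for illustration and merely chosen so the arithmetic lands near the abstract's ten-thousand-per-day figure (the paper derives its estimate from data):

```python
# Back-of-the-envelope moderation load; all inputs below are hypothetical.
daily_tweets = 1_000_000       # hypothetical daily Nigerian tweet volume
prevalence = 0.005             # hypothetical share of tweets that are hateful
recall_target = 0.60           # fraction of hateful content to catch

hateful = daily_tweets * prevalence          # hateful tweets per day
caught = hateful * recall_target             # hateful tweets among the flags
precision = 0.30               # hypothetical precision at that operating point
flagged = caught / precision   # reviews needed = caught / precision
print(round(flagged))          # 10000
```

The takeaway is structural: at modest precision, the review burden scales as recall_target / precision times the hateful volume, which is why low-resource settings are hard to moderate.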
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Dominican Republic (0.04)
- Oceania > Australia (0.04)
- (15 more...)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Information Technology > Security & Privacy (1.00)
- Law (0.67)
- Government (0.67)